Learning the Ontological Positions of Natural Language Objects
Abstract
This thesis addresses a text classification (TC) problem in a real-world system, the New Brunswick Opportunities Network (NBON), an online tendering system that helps vendors and purchasing agents provide and obtain information about business opportunities. The solution draws mainly on techniques from machine learning and natural language processing (NLP). We use a Naïve Bayes classifier, a simple and effective machine learning approach for TC tasks, to automatically classify the tenders of the NBON system. We implement three smoothing algorithms for the Naïve Bayes classifier, namely no-match, Laplace correction, and Lidstone's law of succession, and we show that the difference between the accuracies obtained with the three algorithms is negligible. We show that the Naïve Bayes classifier is more effective than three other, equally simple TC techniques: Strong Predictors (a modification of Term Frequency), TF-IDF (Term Frequency Inverse Document Frequency), and WIDF (Weighted Inverse Document Frequency). NLP tools such as stop lists and stemmers are applied in the text operations on the historical NBON data used to train the classifiers. We experiment with variations of these tools and show that NLP techniques have little impact on the effectiveness of a classifier.
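To make the smoothing step concrete, here is a minimal sketch of a multinomial Naïve Bayes text classifier with Lidstone smoothing, where setting the smoothing constant to 1 gives the Laplace correction. It is an illustration under stated assumptions, not the thesis implementation: the function names `train` and `classify`, the parameter `lam`, and the toy tender data and category labels are all hypothetical, and the no-match algorithm is not shown because the abstract does not define it.

```python
import math
from collections import Counter, defaultdict


def train(docs, lam=1.0):
    """Collect per-class counts from (tokens, label) pairs.

    lam is the Lidstone smoothing constant (hypothetical name);
    lam = 1.0 corresponds to the Laplace correction. It must be > 0.
    """
    class_docs = Counter()              # number of documents per class
    word_counts = defaultdict(Counter)  # word frequencies per class
    vocab = set()
    for tokens, label in docs:
        class_docs[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {
        "class_docs": class_docs,
        "word_counts": word_counts,
        "vocab": vocab,
        "n_docs": len(docs),
        "lam": lam,
    }


def classify(model, tokens):
    """Return the class with the highest smoothed log-posterior."""
    lam = model["lam"]
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_class in model["class_docs"].items():
        counts = model["word_counts"][label]
        total = sum(counts.values())
        # Log prior: fraction of training documents in this class.
        score = math.log(n_class / model["n_docs"])
        for tok in tokens:
            if tok not in model["vocab"]:
                continue  # ignore words never seen in training
            # Lidstone estimate: (count + lam) / (total + lam * |V|)
            score += math.log((counts[tok] + lam) / (total + lam * vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy, made-up tenders; the categories are illustrative only.
docs = [
    (["office", "paper", "supplies"], "goods"),
    (["printer", "paper", "toner"], "goods"),
    (["road", "paving", "services"], "services"),
    (["snow", "removal", "services"], "services"),
]

laplace_model = train(docs, lam=1.0)   # Laplace correction
lidstone_model = train(docs, lam=0.5)  # Lidstone with a smaller constant
print(classify(laplace_model, ["paper", "toner"]))
print(classify(lidstone_model, ["snow", "paving"]))
```

On data this small both smoothing constants select the same class, which is in the spirit of the abstract's observation that the choice of smoothing algorithm makes a negligible difference to accuracy; the toy example is not evidence for that result.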
Similar resources
Interpretation: The Relationship Between Language and Ontology
In this paper we are specifically interested in the relationship between natural language and the knowledge representation (KR) formalism referred to as ontology. By ontology we mean a taxonomic, hierarchical data structure. The reason we use natural language for terms in ontologies is so that we humans can understand the ontologies. Machines and humans who have to understand ontologies interpr...
Concept Revision of Age, Motivation, and Error Correction in Second Language Learning
This review article investigates some variables that contribute to English language teaching and learning. Three factors, age, motivation, and error correction, have been important in English language curricula in language centres. Some studies have investigated the effects of these three components on English language acquisition; those studies, however, may lack d...
Reference to numbers in natural language
A common view is that natural language treats numbers as abstract objects, with expressions like the number of planets, eight, as well as the number eight acting as referential terms referring to numbers. In this paper I will argue that this view about reference to numbers in natural language is fundamentally mistaken. A more thorough look at natural language reveals a very different view of th...
Collaborative Tagging Approaches for Ontological Metadata in Adaptive E-Learning Systems
One of the main approaches for creating metadata for learning resources in adaptive e-learning systems has been through the use of semantic web ontologies. This approach is limiting because it doesn’t usually address a requirement for the support of annotators or the requirement for significant effort by annotators in learning ontological metadata domains and technologies. This paper proposes a...
Analyzing the function of Quranic language from the viewpoint of Alame Tabatabie
realm of Quranic language, which from among Alame Tabatabie's is the most comprehensive. He believes that the Quranic language is a mixture of various languages. The language of some of the Quran's propositions is declarative and describes objective events – both tangible and intangible; five groups of Quranic verses are as stated below: Naturalistic verses: describe natural e...
Publication date: 2003